Python Social Media Analytics by Siddhartha Chatterjee & Michal Krystyanczuk
Author: Siddhartha Chatterjee & Michal Krystyanczuk
Language: eng
Format: epub
Tags: Computers, Data Analytics, COM018000 - COMPUTERS / Data Processing, Data Science
ISBN: 9781787121485
Google: 3J4MMQAACAAJ
Publisher: Packt Publishing
Published: 2017-07-28T20:24:31.911443+00:00
Summary
Sentiment analysis and entity recognition are two powerful social media analytics techniques for getting context around user content. Because sport incites strong sentiment and emotion among audiences, the dataset for this chapter consisted of tweets about the English football Premier League. We collected the data with the Twitter REST and Streaming APIs, applied the basic cleaning explained in Chapter 2, Harnessing Social Data - Connecting, Capturing, and Cleaning, and added new cleaning methods such as device detection from Twitter API metadata.
Sentiment analysis allows us to categorize text as positive, negative, or neutral. We also learnt that its accuracy is limited, especially for ambiguous expressions. We used the VADER (Valence Aware Dictionary and sEntiment Reasoner) module from NLTK for sentiment analysis. We also saw that we can build our own sentiment analysis algorithm through machine learning, using separate training and test datasets. The accuracy of a custom sentiment analyzer depends heavily on the quality and size of the training set. Building and applying our own sentiment analyzer with the Python scikit-learn library, we got an accuracy of around 73%. We applied cross-validation, K-Fold, the confusion matrix, and precision/recall to evaluate the performance of our algorithm.
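The evaluation measures named above can be sketched in plain Python. This is only an illustration, not the chapter's own code: the function names and the toy labels are assumptions made up for the example, and in practice scikit-learn's `sklearn.metrics` module offers equivalent helpers.

```python
from collections import Counter

LABELS = ("positive", "neutral", "negative")


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Return a nested dict: matrix[true_label][predicted_label] -> count."""
    counts = Counter(zip(y_true, y_pred))  # missing pairs count as 0
    return {t: {p: counts[(t, p)] for p in labels} for t in labels}


def precision_recall(matrix, label):
    """Precision and recall for one class, read off the confusion matrix."""
    true_positives = matrix[label][label]
    predicted = sum(matrix[t][label] for t in matrix)  # column total
    actual = sum(matrix[label].values())               # row total
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


def accuracy(matrix):
    """Share of all examples that sit on the diagonal of the matrix."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total if total else 0.0


# Hypothetical labels, invented purely for illustration (not the book's data).
y_true = ["positive", "positive", "neutral", "negative",
          "negative", "neutral", "positive", "negative"]
y_pred = ["positive", "neutral", "neutral", "negative",
          "positive", "neutral", "positive", "negative"]

matrix = confusion_matrix(y_true, y_pred)
print("accuracy:", accuracy(matrix))
for label in LABELS:
    print(label, precision_recall(matrix, label))
```

Reading precision and recall per class, rather than accuracy alone, shows which of the three sentiment categories the classifier confuses most often.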
Entity recognition allows us to sort the terms in textual data into categories such as name, place, organization, and others. It is an efficient way to get a broad understanding of large volumes of social media conversations. We used Stanford NER, a popular Java-based entity recognition module. Applying the library to our football dataset allowed us to extract the most frequently mentioned clubs, locations, and names. We then combined sentiment analysis and entity recognition on the chosen dataset by computing sentiment for each club entity detected. Chelsea, Arsenal, and Liverpool were among the most frequently mentioned clubs, and applying sentiment analysis to them gave us some insights.
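Combining the two techniques amounts to grouping sentiment scores by detected entity. The sketch below assumes each tweet has already been reduced to a list of entity names and a single sentiment score between -1 and 1. The function name, the record layout, and the scores are hypothetical, chosen only to show the aggregation step.

```python
from collections import defaultdict


def mean_sentiment_by_entity(records):
    """Map each entity to (mean sentiment score, number of tweets).

    `records` is an iterable of (entities, score) pairs, where `entities`
    is the list of entity names found in one tweet and `score` is that
    tweet's sentiment score.
    """
    totals = defaultdict(lambda: [0.0, 0])  # entity -> [score sum, count]
    for entities, score in records:
        for entity in set(entities):  # count each entity once per tweet
            totals[entity][0] += score
            totals[entity][1] += 1
    return {entity: (total / count, count)
            for entity, (total, count) in totals.items()}


# Made-up records for illustration only; these are not results from the book.
records = [
    (["Chelsea"], 0.5),
    (["Arsenal", "Chelsea"], -0.5),
    (["Liverpool"], 0.25),
    (["Arsenal"], -0.25),
    ([], 0.75),  # tweets with no detected entity are ignored
]

summary = mean_sentiment_by_entity(records)
for club, (mean_score, count) in sorted(summary.items()):
    print(club, mean_score, count)
```

Keeping the tweet count next to each mean makes it easier to spot averages that rest on only a handful of mentions.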
In the next chapter, we will explore data from YouTube to analyze campaigns.
